PG-1813 Make WAL keys TLI aware #509

Merged
merged 1 commit into from
Aug 8, 2025

dAdAbird
Member

@dAdAbird dAdAbird commented Aug 7, 2025

Before this commit, WAL keys did not take the timeline ID (TLI) into account at all. But after pg_rewind, for example, pg_wal/ may contain segments from two timelines, and the WAL reader may pick the wrong key because LSNs on different timelines can overlap. There was also a second bug: suppose there is a key with the start LSN 0/30000 in TLI 1. After a start in TLI 2, the WAL writer creates a new key with the same start LSN 0/30000, but in TLI 2. The reader would not fetch the latest key because, without the TLI, the two keys look the same.

This PR adds the TLI to the internal keys and uses it together with the LSN when comparing keys.
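
To illustrate the comparison described above, here is a minimal, self-contained sketch. It is not the code from this PR: `WalLocation`, `WalKey`, `wal_location_cmp` and `find_key_for` are hypothetical names, the typedefs are stand-ins for the PostgreSQL types of the same names, and ordering by TLI first and LSN second is an assumption about how the two values are combined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the PostgreSQL typedefs of the same names. */
typedef uint64_t XLogRecPtr;
typedef uint32_t TimeLineID;

/* Hypothetical: the point in the WAL where a key starts being used. */
typedef struct WalLocation
{
	XLogRecPtr	lsn;
	TimeLineID	tli;
} WalLocation;

/* Hypothetical key record; a real internal key carries more fields. */
typedef struct WalKey
{
	WalLocation start;
	int			id;
} WalKey;

/* Assumed ordering: TLI first, then LSN. Returns <0, 0 or >0. */
static int
wal_location_cmp(WalLocation a, WalLocation b)
{
	if (a.tli != b.tli)
		return (a.tli < b.tli) ? -1 : 1;
	if (a.lsn != b.lsn)
		return (a.lsn < b.lsn) ? -1 : 1;
	return 0;
}

/*
 * Return the key with the greatest start location that is not after
 * "target", or NULL if every key starts after it.
 */
static const WalKey *
find_key_for(const WalKey *keys, size_t nkeys, WalLocation target)
{
	const WalKey *best = NULL;

	assert(keys != NULL || nkeys == 0);

	for (size_t i = 0; i < nkeys; i++)
	{
		if (wal_location_cmp(keys[i].start, target) > 0)
			continue;			/* key starts after the target */
		if (best == NULL ||
			wal_location_cmp(keys[i].start, best->start) > 0)
			best = &keys[i];
	}
	return best;
}
```

With two keys that both start at LSN 0/30000, one in TLI 1 and one in TLI 2, a lookup for a record in TLI 1 returns the first key and a lookup for a record in TLI 2 returns the second. An LSN-only comparison could not tell the two keys apart.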

It replaces #491. The code is identical, plus the review comment is addressed. It was just easier to create a new commit than to bother rebasing onto the latest changes.

Fixes: https://perconadev.atlassian.net/browse/PG-1813

@codecov-commenter

codecov-commenter commented Aug 7, 2025

Codecov Report

❌ Patch coverage is 94.44444% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.14%. Comparing base (8d7192c) to head (1810784).
⚠️ Report is 1 commits behind head on TDE_REL_17_STABLE.

❌ Your project status has failed because the head coverage (82.14%) is below the target coverage (90.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@                  Coverage Diff                  @@
##           TDE_REL_17_STABLE     #509      +/-   ##
=====================================================
+ Coverage              82.00%   82.14%   +0.13%     
=====================================================
  Files                     24       25       +1     
  Lines                   3162     3186      +24     
  Branches                 514      518       +4     
=====================================================
+ Hits                    2593     2617      +24     
  Misses                   460      460              
  Partials                 109      109              
Components Coverage Δ
access 83.16% <92.85%> (+0.34%) ⬆️
catalog 87.61% <ø> (ø)
common 77.77% <ø> (ø)
encryption 72.97% <ø> (-0.48%) ⬇️
keyring 73.21% <ø> (ø)
src 94.15% <ø> (ø)
smgr 95.29% <ø> (ø)
transam ∅ <ø> (∅)


run_test('remote');

done_testing();
Collaborator

Missing newline at end of file.

@@ -166,7 +185,8 @@ TDEXLogShmemInit(void)

typedef struct EncryptionStateData
{
XLogRecPtr enc_key_lsn; /* to sync with reader */
XLogRecPtr enc_key_tli;
Collaborator

I know this is not new, but this double EncryptionStateData seems strange to me. Does postgres not support the atomic types in the frontend, or why do we have to duplicate them?

Also, we now have a size difference: tli is 32 bit in the backend code, and 64 in the frontend.

Member Author

  1. Yes, frontend doesn't support atomics:
...
#define ATOMICS_H

#ifdef FRONTEND
#error "atomics.h may not be included from frontend code"
#endif
...
  2. My bad, it must be a TimeLineID type. Fixed.
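
As a rough illustration of the two points in this exchange, the sketch below keeps two variants of the struct behind `#ifdef FRONTEND` and gives the TLI field the same 32-bit width in both. It is an assumption-laden sketch, not the pg_tde definition: C11 `<stdatomic.h>` stands in for PostgreSQL's atomics, the accessors `state_set`, `state_get_lsn` and `state_get_tli` are hypothetical, and whether the real backend struct stores the TLI atomically is not stated in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the PostgreSQL typedefs of the same names. */
typedef uint64_t XLogRecPtr;
typedef uint32_t TimeLineID;

#ifdef FRONTEND
/* Frontend variant: atomics are unavailable, so plain fields are used. */
typedef struct EncryptionStateData
{
	XLogRecPtr	enc_key_lsn;
	TimeLineID	enc_key_tli;	/* TimeLineID, not XLogRecPtr */
} EncryptionStateData;

static void
state_set(EncryptionStateData *s, XLogRecPtr lsn, TimeLineID tli)
{
	assert(s != NULL);
	s->enc_key_lsn = lsn;
	s->enc_key_tli = tli;
}

static XLogRecPtr
state_get_lsn(EncryptionStateData *s)
{
	return s->enc_key_lsn;
}

static TimeLineID
state_get_tli(EncryptionStateData *s)
{
	return s->enc_key_tli;
}
#else
#include <stdatomic.h>

/* Backend-like variant: C11 atomics used here as a stand-in. */
typedef struct EncryptionStateData
{
	_Atomic XLogRecPtr enc_key_lsn;
	_Atomic TimeLineID enc_key_tli;
} EncryptionStateData;

/* Each field is stored atomically; the pair as a whole is not. */
static void
state_set(EncryptionStateData *s, XLogRecPtr lsn, TimeLineID tli)
{
	assert(s != NULL);
	atomic_store(&s->enc_key_lsn, lsn);
	atomic_store(&s->enc_key_tli, tli);
}

static XLogRecPtr
state_get_lsn(EncryptionStateData *s)
{
	return atomic_load(&s->enc_key_lsn);
}

static TimeLineID
state_get_tli(EncryptionStateData *s)
{
	return atomic_load(&s->enc_key_tli);
}
#endif

/* Fails at compile time if the TLI type is ever widened by mistake. */
_Static_assert(sizeof(TimeLineID) == 4, "TimeLineID must be 32 bits");
```

Going through accessors keeps callers identical in both builds, and declaring the field as `TimeLineID` in both variants avoids the 32-bit versus 64-bit mismatch raised above.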

Before this commit, WAL keys did not take the timeline ID (TLI) into
account at all. But after pg_rewind, for example, pg_wal/ may contain
segments from two timelines, and the WAL reader may pick the wrong key
because LSNs on different timelines can overlap. There was also a
second bug: suppose there is a key with the start LSN 0/30000 in TLI 1.
After a start in TLI 2, the WAL writer creates a new key with the same
start LSN 0/30000, but in TLI 2. The reader would not fetch the latest
key because, without the TLI, the two keys look the same.

This commit adds the TLI to the internal keys and uses it together
with the LSN when comparing keys.
@dAdAbird dAdAbird merged commit 1a20e9b into percona:TDE_REL_17_STABLE Aug 8, 2025
19 checks passed
@dAdAbird dAdAbird deleted the wal_key_tli branch August 8, 2025 10:32
@dAdAbird dAdAbird mentioned this pull request Aug 8, 2025